{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Window functions\n",
    "\n",
    "General Elections were held in the UK in 2015 and 2017. Every citizen votes in a constituency. The candidate who gains the most votes becomes MP for that constituency.\n",
    "\n",
    "All these results are recorded in the table `ge`:\n",
    "\n",
    "yr\t| firstName\t| lastName\t| constituency\t| party\t| votes\n",
    "---:|-----------|-----------|---------------|-------|------:\n",
    "2015\t| Ian\t| Murray\t| S14000024\t| Labour\t| 19293\n",
    "2015\t| Neil\t| Hay\t| S14000024\t| Scottish National Party\t| 16656\n",
    "2015\t| Miles\t| Briggs\t| S14000024\t| Conservative | 8626\n",
    "2015\t| Phyl\t| Meyer\t| S14000024\t| Green\t| 2090\n",
    "2015\t| Pramod\t| Subbaraman\t| S14000024\t| Liberal Democrat\t| 1823\n",
    "2015\t| Paul\t| Marshall\t| S14000024\t| UK Independence Party\t | 601\n",
    "2015\t| Colin\t| Fox\t| S14000024\t| Scottish Socialist Party\t| 197\n",
    "2017\t| Ian\t| MURRAY\t| S14000024\t| Labour\t| 26269\n",
    "2017\t| Jim\t| EADIE\t| S14000024\t| SNP\t| 10755\n",
    "2017\t| Stephanie Jane Harley\t| SMITH\t| S14000024\t| Conservative\t| 9428\n",
    "2017\t| Alan Christopher\t| BEAL\t| S14000024\t| Liberal Democrats\t| 1388"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Prerequisites\n",
    "from pyhive import hive\n",
    "%load_ext sql\n",
    "%sql hive://cloudera@quickstart.cloudera:10000/sqlzoo\n",
    "%config SqlMagic.displaylimit = 20"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Warming up\n",
    "\n",
    "Show the **lastName, party** and **votes** for the **constituency** 'S14000024' in 2017."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * hive://cloudera@quickstart.cloudera:10000/sqlzoo\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>lastname</th>\n",
       "        <th>party</th>\n",
       "        <th>votes</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>MURRAY</td>\n",
       "        <td>Labour</td>\n",
       "        <td>26269</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>EADIE</td>\n",
       "        <td>SNP</td>\n",
       "        <td>10755</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>SMITH</td>\n",
       "        <td>Conservative</td>\n",
       "        <td>9428</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>BEAL</td>\n",
       "        <td>Liberal Democrats</td>\n",
       "        <td>1388</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[('MURRAY', 'Labour', 26269),\n",
       " ('EADIE', 'SNP', 10755),\n",
       " ('SMITH', 'Conservative', 9428),\n",
       " ('BEAL', 'Liberal Democrats', 1388)]"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql\n",
    "SELECT lastName, party, votes\n",
    "  FROM ge\n",
    "    WHERE constituency = 'S14000024' AND yr = 2017\n",
    "    ORDER BY votes DESC"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Who won?\n",
    "\n",
    "You can use the RANK function to see the order of the candidates. If you RANK using (ORDER BY votes DESC) then the candidate with the most votes has rank 1.\n",
    "\n",
    "**Show the party and RANK for constituency S14000024 in 2017. Order the output by party.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * hive://cloudera@quickstart.cloudera:10000/sqlzoo\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>party</th>\n",
       "        <th>votes</th>\n",
       "        <th>posn</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Conservative</td>\n",
       "        <td>9428</td>\n",
       "        <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Labour</td>\n",
       "        <td>26269</td>\n",
       "        <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Liberal Democrats</td>\n",
       "        <td>1388</td>\n",
       "        <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>SNP</td>\n",
       "        <td>10755</td>\n",
       "        <td>2</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[('Conservative', 9428, 3),\n",
       " ('Labour', 26269, 1),\n",
       " ('Liberal Democrats', 1388, 4),\n",
       " ('SNP', 10755, 2)]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql\n",
    "SELECT party, votes,\n",
    "       RANK() OVER (ORDER BY votes DESC) posn\n",
    "    FROM ge\n",
    "    WHERE constituency = 'S14000024' AND yr = 2017\n",
    "    ORDER BY party"
   ]
  },
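  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The ranking above can also be reproduced without the Hive server. The sketch below is not part of the original exercise: it assumes Python's built-in `sqlite3` module is linked against SQLite 3.25 or later (the first release with window functions), and it loads only the four 2017 rows for S14000024 from the sample table into a hypothetical in-memory copy of `ge`.\n",
    "\n",
    "```python\n",
    "import sqlite3\n",
    "\n",
    "# Hypothetical in-memory stand-in for the ge table (2017 rows for S14000024 only).\n",
    "conn = sqlite3.connect(':memory:')\n",
    "conn.execute('CREATE TABLE ge (yr INTEGER, constituency TEXT, party TEXT, votes INTEGER)')\n",
    "conn.executemany(\n",
    "    'INSERT INTO ge VALUES (?, ?, ?, ?)',\n",
    "    [\n",
    "        (2017, 'S14000024', 'Labour', 26269),\n",
    "        (2017, 'S14000024', 'SNP', 10755),\n",
    "        (2017, 'S14000024', 'Conservative', 9428),\n",
    "        (2017, 'S14000024', 'Liberal Democrats', 1388),\n",
    "    ],\n",
    ")\n",
    "\n",
    "# RANK() numbers the rows by votes, highest first; the outer ORDER BY only\n",
    "# controls the order in which the ranked rows are listed.\n",
    "ranked = conn.execute(\n",
    "    'SELECT party, votes, RANK() OVER (ORDER BY votes DESC) AS posn '\n",
    "    'FROM ge WHERE constituency = ? AND yr = ? ORDER BY party',\n",
    "    ('S14000024', 2017),\n",
    ").fetchall()\n",
    "\n",
    "for row in ranked:\n",
    "    print(row)\n",
    "```\n",
    "\n",
    "The window's `ORDER BY votes DESC` decides the rank, while the final `ORDER BY party` only decides the listing order, which is why the ranks appear out of sequence."
   ]
  },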
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. PARTITION BY\n",
    "\n",
    "Each election year forms its own PARTITION: the 2015 results are ranked separately from the 2017 results, because we only care about the order of votes within each year.\n",
    "\n",
    "**Use PARTITION to show the ranking of each party in S14000021 in each year. Include yr, party, votes and ranking (the party with the most votes is 1).**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * hive://cloudera@quickstart.cloudera:10000/sqlzoo\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>yr</th>\n",
       "        <th>party</th>\n",
       "        <th>votes</th>\n",
       "        <th>posn</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>2015</td>\n",
       "        <td>Conservative</td>\n",
       "        <td>12465</td>\n",
       "        <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>2017</td>\n",
       "        <td>Conservative</td>\n",
       "        <td>21496</td>\n",
       "        <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>2019</td>\n",
       "        <td>Conservative</td>\n",
       "        <td>19451</td>\n",
       "        <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>2015</td>\n",
       "        <td>Labour</td>\n",
       "        <td>19295</td>\n",
       "        <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>2017</td>\n",
       "        <td>Labour</td>\n",
       "        <td>14346</td>\n",
       "        <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>2019</td>\n",
       "        <td>Labour</td>\n",
       "        <td>6855</td>\n",
       "        <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>2015</td>\n",
       "        <td>Liberal Democrats</td>\n",
       "        <td>1069</td>\n",
       "        <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>2017</td>\n",
       "        <td>Liberal Democrats</td>\n",
       "        <td>1112</td>\n",
       "        <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>2019</td>\n",
       "        <td>Liberal Democrats</td>\n",
       "        <td>4174</td>\n",
       "        <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>2015</td>\n",
       "        <td>SNP</td>\n",
       "        <td>23013</td>\n",
       "        <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>2019</td>\n",
       "        <td>SNP</td>\n",
       "        <td>24877</td>\n",
       "        <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>2015</td>\n",
       "        <td>UKIP</td>\n",
       "        <td>888</td>\n",
       "        <td>5</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[(2015, 'Conservative', 12465, 3),\n",
       " (2017, 'Conservative', 21496, 1),\n",
       " (2019, 'Conservative', 19451, 2),\n",
       " (2015, 'Labour', 19295, 2),\n",
       " (2017, 'Labour', 14346, 2),\n",
       " (2019, 'Labour', 6855, 3),\n",
       " (2015, 'Liberal Democrats', 1069, 4),\n",
       " (2017, 'Liberal Democrats', 1112, 3),\n",
       " (2019, 'Liberal Democrats', 4174, 4),\n",
       " (2015, 'SNP', 23013, 1),\n",
       " (2019, 'SNP', 24877, 1),\n",
       " (2015, 'UKIP', 888, 5)]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql\n",
    "SELECT yr, party, votes,\n",
    "      RANK() OVER (PARTITION BY yr ORDER BY votes DESC) posn\n",
    "    FROM ge\n",
    "    WHERE constituency = 'S14000021'\n",
    "    ORDER BY party, yr"
   ]
  },
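  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see what `PARTITION BY yr` does, it can help to emulate it by hand. The following is a plain-Python sketch, not a description of how Hive evaluates the query: it groups rows by year and restarts the ranking inside each group. The helper name `rank_within_partitions` is made up for illustration, and the rows are a subset of the S14000024 results from the sample table at the top of this notebook.\n",
    "\n",
    "```python\n",
    "from itertools import groupby\n",
    "\n",
    "# (yr, party, votes) for S14000024, taken from the sample table.\n",
    "rows = [\n",
    "    (2015, 'Labour', 19293),\n",
    "    (2015, 'Scottish National Party', 16656),\n",
    "    (2015, 'Conservative', 8626),\n",
    "    (2017, 'Labour', 26269),\n",
    "    (2017, 'SNP', 10755),\n",
    "    (2017, 'Conservative', 9428),\n",
    "]\n",
    "\n",
    "\n",
    "def rank_within_partitions(rows):\n",
    "    # Emulates RANK() OVER (PARTITION BY yr ORDER BY votes DESC).\n",
    "    ranked = []\n",
    "    # Sort by partition key, then by votes descending within each partition.\n",
    "    ordered = sorted(rows, key=lambda r: (r[0], -r[2]))\n",
    "    for yr, group in groupby(ordered, key=lambda r: r[0]):\n",
    "        previous_votes, posn = None, 0\n",
    "        for seen, (_, party, votes) in enumerate(group, start=1):\n",
    "            # Tied rows share a rank; the next distinct value skips ahead.\n",
    "            if votes != previous_votes:\n",
    "                posn = seen\n",
    "            previous_votes = votes\n",
    "            ranked.append((yr, party, votes, posn))\n",
    "    return ranked\n",
    "\n",
    "\n",
    "for row in rank_within_partitions(rows):\n",
    "    print(row)\n",
    "```\n",
    "\n",
    "Each year starts again at rank 1, which is the effect `PARTITION BY yr` has in the query."
   ]
  },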
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Edinburgh Constituency\n",
    "\n",
    "Edinburgh constituencies are numbered S14000021 to S14000026.\n",
    "\n",
    "**Use PARTITION BY constituency to show the ranking of each party in Edinburgh in 2017. Order your results so the winners are shown first, then ordered by constituency.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * hive://cloudera@quickstart.cloudera:10000/sqlzoo\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>constituency</th>\n",
       "        <th>party</th>\n",
       "        <th>votes</th>\n",
       "        <th>posn</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000021</td>\n",
       "        <td>Conservative</td>\n",
       "        <td>21496</td>\n",
       "        <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000022</td>\n",
       "        <td>SNP</td>\n",
       "        <td>18509</td>\n",
       "        <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000023</td>\n",
       "        <td>SNP</td>\n",
       "        <td>19243</td>\n",
       "        <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000024</td>\n",
       "        <td>Labour</td>\n",
       "        <td>26269</td>\n",
       "        <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000025</td>\n",
       "        <td>SNP</td>\n",
       "        <td>17575</td>\n",
       "        <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000026</td>\n",
       "        <td>Liberal Democrats</td>\n",
       "        <td>18108</td>\n",
       "        <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000021</td>\n",
       "        <td>Labour</td>\n",
       "        <td>14346</td>\n",
       "        <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000022</td>\n",
       "        <td>Labour</td>\n",
       "        <td>15084</td>\n",
       "        <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000023</td>\n",
       "        <td>Labour</td>\n",
       "        <td>17618</td>\n",
       "        <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000024</td>\n",
       "        <td>SNP</td>\n",
       "        <td>10755</td>\n",
       "        <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000025</td>\n",
       "        <td>Conservative</td>\n",
       "        <td>16478</td>\n",
       "        <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000026</td>\n",
       "        <td>SNP</td>\n",
       "        <td>15120</td>\n",
       "        <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000021</td>\n",
       "        <td>Liberal Democrats</td>\n",
       "        <td>1112</td>\n",
       "        <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000022</td>\n",
       "        <td>Liberal Democrats</td>\n",
       "        <td>1849</td>\n",
       "        <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000023</td>\n",
       "        <td>Conservative</td>\n",
       "        <td>15385</td>\n",
       "        <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000024</td>\n",
       "        <td>Conservative</td>\n",
       "        <td>9428</td>\n",
       "        <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000025</td>\n",
       "        <td>Labour</td>\n",
       "        <td>13213</td>\n",
       "        <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000026</td>\n",
       "        <td>Conservative</td>\n",
       "        <td>11559</td>\n",
       "        <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000023</td>\n",
       "        <td>Liberal Democrats</td>\n",
       "        <td>2579</td>\n",
       "        <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000024</td>\n",
       "        <td>Liberal Democrats</td>\n",
       "        <td>1388</td>\n",
       "        <td>4</td>\n",
       "    </tr>\n",
       "</table>\n",
       "<span style=\"font-style:italic;text-align:center;\">24 rows, truncated to displaylimit of 20</span>"
      ],
      "text/plain": [
       "[('S14000021', 'Conservative', 21496, 1),\n",
       " ('S14000022', 'SNP', 18509, 1),\n",
       " ('S14000023', 'SNP', 19243, 1),\n",
       " ('S14000024', 'Labour', 26269, 1),\n",
       " ('S14000025', 'SNP', 17575, 1),\n",
       " ('S14000026', 'Liberal Democrats', 18108, 1),\n",
       " ('S14000021', 'Labour', 14346, 2),\n",
       " ('S14000022', 'Labour', 15084, 2),\n",
       " ('S14000023', 'Labour', 17618, 2),\n",
       " ('S14000024', 'SNP', 10755, 2),\n",
       " ('S14000025', 'Conservative', 16478, 2),\n",
       " ('S14000026', 'SNP', 15120, 2),\n",
       " ('S14000021', 'Liberal Democrats', 1112, 3),\n",
       " ('S14000022', 'Liberal Democrats', 1849, 3),\n",
       " ('S14000023', 'Conservative', 15385, 3),\n",
       " ('S14000024', 'Conservative', 9428, 3),\n",
       " ('S14000025', 'Labour', 13213, 3),\n",
       " ('S14000026', 'Conservative', 11559, 3),\n",
       " ('S14000023', 'Liberal Democrats', 2579, 4),\n",
       " ('S14000024', 'Liberal Democrats', 1388, 4),\n",
       " ('S14000025', 'Liberal Democrats', 2124, 4),\n",
       " ('S14000026', 'Labour', 7876, 4),\n",
       " ('S14000023', 'Green', 1727, 5),\n",
       " ('S14000026', 'SIR', 132, 5)]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql\n",
    "SELECT constituency, party, votes,\n",
    "       RANK() OVER (PARTITION BY constituency ORDER BY votes DESC) posn\n",
    "    FROM ge\n",
    "    WHERE constituency BETWEEN 'S14000021' AND 'S14000026'\n",
    "    AND yr  = 2017\n",
    "    ORDER BY posn, constituency"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Winners Only\n",
    "\n",
    "You can use [SELECT within SELECT](https://sqlzoo.net/wiki/SELECT_within_SELECT_Tutorial) to pick out only the winners in Edinburgh.\n",
    "\n",
    "**Show the parties that won for each Edinburgh constituency in 2017.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * hive://cloudera@quickstart.cloudera:10000/sqlzoo\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>constituency</th>\n",
       "        <th>party</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000021</td>\n",
       "        <td>Conservative</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000022</td>\n",
       "        <td>SNP</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000023</td>\n",
       "        <td>SNP</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000024</td>\n",
       "        <td>Labour</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000025</td>\n",
       "        <td>SNP</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>S14000026</td>\n",
       "        <td>Liberal Democrats</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[('S14000021', 'Conservative'),\n",
       " ('S14000022', 'SNP'),\n",
       " ('S14000023', 'SNP'),\n",
       " ('S14000024', 'Labour'),\n",
       " ('S14000025', 'SNP'),\n",
       " ('S14000026', 'Liberal Democrats')]"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql\n",
    "SELECT constituency, party FROM\n",
    " (SELECT constituency, party, \n",
    "  RANK() OVER (PARTITION BY constituency ORDER BY votes DESC) posn\n",
    "  FROM ge\n",
    "  WHERE constituency BETWEEN 'S14000021' AND 'S14000026'\n",
    "  AND yr  = 2017) AS a\n",
    "    WHERE a.posn=1"
   ]
  },
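  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The subquery is needed because a window function is evaluated after the `WHERE` clause of its own query level, so `posn` can only be filtered from an enclosing query. A minimal sketch of the same pattern follows. It assumes Python's built-in `sqlite3` module with SQLite 3.25 or later rather than Hive, and the hypothetical in-memory `ge` table holds only the 2017 rows for S14000024 and S14000026, copied from the output of section 4.\n",
    "\n",
    "```python\n",
    "import sqlite3\n",
    "\n",
    "conn = sqlite3.connect(':memory:')\n",
    "conn.execute('CREATE TABLE ge (yr INTEGER, constituency TEXT, party TEXT, votes INTEGER)')\n",
    "conn.executemany(\n",
    "    'INSERT INTO ge VALUES (?, ?, ?, ?)',\n",
    "    [\n",
    "        (2017, 'S14000024', 'Labour', 26269),\n",
    "        (2017, 'S14000024', 'SNP', 10755),\n",
    "        (2017, 'S14000024', 'Conservative', 9428),\n",
    "        (2017, 'S14000024', 'Liberal Democrats', 1388),\n",
    "        (2017, 'S14000026', 'Liberal Democrats', 18108),\n",
    "        (2017, 'S14000026', 'SNP', 15120),\n",
    "        (2017, 'S14000026', 'Conservative', 11559),\n",
    "        (2017, 'S14000026', 'Labour', 7876),\n",
    "        (2017, 'S14000026', 'SIR', 132),\n",
    "    ],\n",
    ")\n",
    "\n",
    "# Inner query: rank parties inside each constituency.\n",
    "# Outer query: keep only the rank-1 row of every partition.\n",
    "winners = conn.execute(\n",
    "    'SELECT constituency, party FROM ('\n",
    "    '  SELECT constituency, party,'\n",
    "    '         RANK() OVER (PARTITION BY constituency ORDER BY votes DESC) AS posn'\n",
    "    '  FROM ge WHERE yr = ?'\n",
    "    ') AS a WHERE a.posn = 1 ORDER BY constituency',\n",
    "    (2017,),\n",
    ").fetchall()\n",
    "\n",
    "print(winners)\n",
    "```\n",
    "\n",
    "The inner query produces one ranked row per candidate, and the outer `WHERE a.posn = 1` keeps a single winner per constituency."
   ]
  },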
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6. Scottish seats\n",
    "\n",
    "You can use **COUNT** and **GROUP BY** to see how each party did in Scotland. Scottish constituencies start with 'S'.\n",
    "\n",
    "**Show how many seats each party won in Scotland in 2017.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " * hive://cloudera@quickstart.cloudera:10000/sqlzoo\n",
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>party</th>\n",
       "        <th>seats</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Conservative</td>\n",
       "        <td>12</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Labour</td>\n",
       "        <td>9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>Liberal Democrats</td>\n",
       "        <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>SNP</td>\n",
       "        <td>34</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[('Conservative', 12), ('Labour', 9), ('Liberal Democrats', 4), ('SNP', 34)]"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%sql\n",
    "SELECT party, COUNT(party) seats FROM\n",
    " (SELECT party, constituency, \n",
    "  RANK() OVER (PARTITION BY constituency ORDER BY votes DESC) posn\n",
    "  FROM ge\n",
    "  WHERE constituency LIKE 'S%' AND yr=2017) AS a\n",
    "    WHERE a.posn=1\n",
    "    GROUP BY party"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
